PRESIDENT'S REPORT ON 1981
IASSIST-IFDO CONFERENCE


Alice  Robbin 


This is a report on a planning meeting for the 1981
IASSIST-IFDO Conference that I attended in my capacity
as President of IASSIST. The report summarizes agreements
on the objectives of the Conference, its contents, audiences,
and IASSIST's role and responsibilities.

On October 6th and 7th, I attended the meeting in Paris,
France. My trip was funded by the National Science
Foundation as part of its program to help international
organizations sponsor and plan professional society meetings.
The interests of IASSIST correspond rather well with the
objectives of the Foundation's travel grants and, in particular,
with the French-American program of the International
Program Division. In attendance was the Executive Planning
Committee, composed of French members, F. BON and B.
BOUET of the University of Grenoble, J.-P. GREMY of the
University Descartes (and also a member of IASSIST), and
myself. G. MARTINOTTI, President of IFDO, was unable to
attend. The meeting was a delight, and we worked steadily
with  few  interruptions  for  hours  on  end,  all  of  us  committed 
to  completing  a  long  agenda  of  items  and  each  recognizing 
that  we  had  very  little  time  to  exchange  information.  I  am 
most  grateful  for  the  careful  pre-meeting  planning  and 
enthusiasm  that  the  French  members  of  the  Executive 
Committee  demonstrated  and  the  good  humor  that  reigned 
throughout  the  long  days. 

Here  are  the  particulars: 

TITLE: 

The Impact of Computerization on Social Science Research:
Data Services and Technological Developments / L'impact de
l'informatique sur les recherches en sciences sociales: les
banques de données et les développements technologiques

DATE: 

September  14-18,  1981 

PLACE: 

University  of  Grenoble,  Grenoble,  France 

CONFERENCE  ORGANIZERS: 

IASSIST, IFDO, Laboratoire d'Informatique pour les
Sciences de l'Homme (LISH) / Centre National de la
Recherche Scientifique (CNRS), and the University of
Grenoble


WORKING  LANGUAGES: 

English  and  French  (Sessions  will  have  both  French  and 
English speaking professionals who will act as informal
intermediaries.) 

REGISTRATION  FEE: 

$35 U.S. or 150 FF.

PRELIMINARY SCHEDULE:

•  End  of  October:  'Call  for  Papers'  (By  this  time  all 
IASSIST members should have received a copy.)

•  Middle  of  February  1981:  Intention  to  submit  a  paper 
should be submitted to the French Planning Committee.

•  Beginning  of  April  1981:  Abstract  of  paper  submitted  to 
the  French  Planning  Committee  (no  more  than  500 
words). 

• Beginning of May 1981: Notification of acceptance of
paper.

•  End  of  June — Beginning  of  July  1981:  Preliminary 
Agenda  for  the  Conference 

This schedule should allow ample time for IASSIST members
to decide whether they want to attend the Grenoble meetings
and to apply for institutional and/or extramural funding
support. (And also to review our French.)


CONFERENCE  AGENDA: 


Plenary  Session:  Title:  Computers  and  Information:  Their 
Societal  Impact 


• New Types of Research will include the processing of
large ecological files and survey, textual, historical, and
biographical data; complex data (historical, biographical,
time series, genealogical, topological, and textual): their
creation, structure, administration, and preservation, and
problems arising from their exchange; secondary
analysis; computer mapping; and problems of
aggregation and disaggregation.

• New Institutions will include development of data banks
and their perspectives; politics of creating and
disseminating data; long-term storage; sociological
approaches to the data world (producers, users, services);
information systems about data and machine readable
files.

• New Tools will include networks, shared and distributed
data bases, on-line data bases; micro- and mini-computers;
data base management systems for complex
data bases and perspectives on the application of artificial
intelligence and knowledge representation; compatibility
(through translation, pivot languages); and scientific
information retrieval.

• Relations between data producers and researchers
will include data description and documentation; user
needs for information and data; economics of data
services; information on organizational sources for public
and commercial data; political, organizational, and
technical problems associated with process-produced data
(public/administrative records); public opinion polls:
their quality, effects on political decision-making, access
conditions; anonymization of data (and deductive
disclosure).

As you can see, this is a broad methodological, technical,
research and policy agenda for the Conference. It permits
wide participation by members of IASSIST, who can define
their own contributions within the proposed topics. The
breadth of subjects is indicative of the wide range of interests
of members of IASSIST, IFDO, and the French social
scientific community.


WORKSHOP AGENDA:

A one-day series of workshops devoted to three topics of
substantial interest to IASSIST members and the French
social research community and government agencies has been
proposed. The workshops will be the responsibility of
IASSIST and its members. IASSIST members should
immediately signal their interest in participating in a
workshop, submit an abstract of the objectives of their
contribution, and as extensive as possible an outline of the
proposed discussion (including the estimated length of time
required for the presentation). Copies of the abstract and
outline should be sent to the IASSIST 1981 workshop chair,
Laine Ruus, and members of the Program Committee
(Robbin, McManus, Rowe, Von Brunken, Gavrel, and
Schrik; addresses found below). IASSIST will attempt to
secure travel and lodging support for those individuals
selected to participate in these workshops (at the same time,
individuals should attempt to locate matching funds). The
Program Committee will develop an extensive outline of the
contents of each of the workshops, to be forwarded to various
funding agencies. This outline should be completed by the
middle of December; therefore, IASSIST members wishing
to participate in these workshops should contact IASSIST
Program Committee members and the President as soon as
possible.

Topic 1: The Assay/Evaluation of Survey, Ecological,
Cartographic, and Satellite Data. This is viewed as
a working session (read "technical"), lasting an
entire day, during which time four individuals will
present the conceptual and methodological activities
required to evaluate the quality of data (one person on
survey data, the second on ecological, the third on
cartographic, and the fourth on satellite data).

Topic 2: The Organization and Management of Data
Services. This is viewed as a working session, lasting
either a half or a whole day (in conjunction with
Topic 3), during which time at least one individual
will discuss the political, economic, and
administrative problems associated with the
organization and management of data services;
budgeting; personnel recruitment and training (but
not to be confused with the next topic); building a
collection; user services; relationships of personnel to
users, software and computer; maintenance and
preservation; user needs; linkages to other
information services, etc.

Topic 3: The Formation of a Professional Data Archivist
and Librarian. This is viewed as a working session,
lasting either a half or a whole day (in conjunction
with Topic 2), during which time at least one
individual will discuss problems in recruitment and
training of data services' personnel, based on
previous experience, and knowledge necessary to
provide data services. Hands-on experience in the
various aspects of understanding data and providing
data services will be provided.


EXPOSITION/DEMONSTRATION DURING THE
CONFERENCE:

Three  areas  of  interest  have  been  identified,  and  others  are 
sought. 

• Build the exposition around minicomputing and social
research: display of materials created and used by
IASSIST and IFDO members, with demonstrations
during the conference.

• Invite data base vendors (e.g., VIEWDATA) to
demonstrate their wares to the social science community.

• Invite the network people of EURONET and regional and
international network organizations (e.g., CHRONOS
Data Bank of the EEC) to demonstrate on-line access and
retrieval.


FUNDING 

The cost of attending this conference should not be
minimized. In an effort to assist social and information
scientists' participation in the Grenoble conference, a variety
of funding strategies are suggested. First, as President of
IASSIST, I have made application to the U.S. National
Science Foundation for a group travel grant on behalf of U.S.
social and information scientists. If awarded, the grant would
provide 13 U.S. scientists with air travel between their home
institution and the conference (an average grant of $884).
The award will be competed for and is open to all U.S.
scientists. Selection criteria include a good distribution
between young scientists whose professional development
will profit by attendance and experienced scientists who are
knowledgeable about social science information
infrastructure development and can contribute to cooperative
social science research activity; good geographic and
institutional distribution; and a mix of contributed papers,
invited papers, and workshop organizers. The Selection
Committee is composed of social and information scientists
who have long been involved in efforts to develop social
science data services and in organizing access to
machine-readable data. The Committee is chaired by Dr. Joseph W.
Duncan, Director of the U.S. Office of Federal Statistical
Policy and Standards. Should the grant be awarded, IASSIST
members and other social scientists will be immediately
informed through their professional journals. Instructions will
be supplied on to whom and in what form your proposed
paper should be described. The grant proposal will be
submitted by January 1, 1981. I hope that a decision will be
forthcoming by April 1st and that the Selection Committee
will make its final decision by the end of May. You are very
much encouraged to respond immediately to the 'Call for
Papers' (directly to Frederic Bon). I will be kept informed by
the French organizers about the IASSIST members' and other
social and information scientists' interest in participating in
the Grenoble conference. Should you have any questions,
please feel free to call me.

Second, IASSIST members should make every effort to
secure institutional or extramural funding support if they wish
to attend this Conference. National research funding agencies
(e.g., Social Science Research Councils and the like) have
limited amounts of funding for individual travel grants. For
example, the U.S. National Science Foundation has a limited
budget for individual awards to individuals who have been
invited to organize a special session or lecture at a plenary
session of an international scientific meeting. Other national
funding bodies have funds for individuals who will give
papers/communications in sessions at international scientific
meetings. Since there is ample time to make special requests
for institutional support, this strategy should be employed.

Third, an effort will be made to interest data processing and
minicomputer manufacturers to assist in the conference. All
ideas on this front are solicited from the members of
IASSIST. Please contact me if you have any thoughts on this
matter.


IASSIST is an international organization. It was organized to
reflect the needs of professionals involved in providing data
services. Also included in its mandate is the objective of
providing the expertise from among its members to assist in
developing data services in countries where this need has
been articulated. The French scholarly community has
requested our assistance in the planning and in the success of
this Conference. It behooves each of us to diffuse the
knowledge and experience we have gained in providing data
services. This international Conference will allow us to do
just that. I look eagerly for your enthusiasm and commitment
to making this Conference a success.


IASSIST 1981 Workshop Chair

Laine Ruus, Data Library, Computing Centre, University of
British Columbia, 2075 Wesbrook Place, Vancouver, British
Columbia, Canada V6T 1W5


Program Committee Members (IASSIST)

1. Alice Robbin, President, IASSIST, Data and Program
Library Service, University of Wisconsin-Madison,
4452 Social Science Building, Madison, Wisconsin
53706 (tel. no.: (608) 262-7962)

2. Sue Gavrel, Machine Readable Archives, Public Archives
of Canada, 395 Wellington Street, Ottawa, Ontario K1A
0N4, Canada

3. Judith Rowe, Computer Center, Princeton University, 87
Prospect Avenue, Princeton, New Jersey 08544, USA

4. Henk Schrik, Steinmetz Archives, Herengracht 410-412,
1017 BX Amsterdam, The Netherlands

5. Nancy McManus, Social Science Research Council, 1755
Massachusetts Ave., N.W., Washington, D.C. 20036,
USA

6. Erika Von Brunken, Medical Information Center,
Karolinska Institutet, S-104 01 Stockholm, Sweden


A STRATEGY FOR ARCHIVING
GOVERNMENT DATA TO MEET THE NEEDS
OF THE RESEARCH COMMUNITY


Dr.  Jake  Knoppers 

Senior  Advisor  (Information  Management) 
Public  Archives  of  Canada 


BACKGROUND 


Government Administration and Survey Data as a
Research Resource

By far the greatest collector and user of information is the
government. The largest portion of the data collected is for
administrative purposes, e.g., collection of taxes; distribution
of socio-economic benefits; regulation of industry, trade and
commerce, etc. As a result of these activities, the government
builds and maintains enormous stores of information or data
banks. This information is collected continuously albeit with
some periodicity, e.g., annual tax returns, monthly filings.

In  addition  to  the  information  which  is  collected  pursuant  to 
Acts  of  Parliament  and  initiatives  of  the  government,  the 
various  departments  also  carry  out  innumerable  surveys,  or 
commission surveys and opinion polls to be carried out by
third  parties. 

Altogether,  Statistics  Canada  has  identified  2,352  major  data 
banks  so  far,  including  both  manual  and  machine  readable. 
(1)  Some  data  relates  to  individuals  while  other  data  refers  to 
larger  aggregates  such  as  firms,  households  and  national 
accounts. The number of subjects in each data bank ranges
from as few as 200 to as high as 21,500,000.


As  a  matter  of  fact,  there  is  a  growing  recognition  that  the 
various departments of government may have gone too far in
their myriad data collection activities. The government has
responded to these concerns through the "reduction of
paperburden" initiative. This finds expression in such
management  tools  as  forms  control,  data  collection  approval 
mechanisms,  and  data  locator  systems. 

Perhaps  the  desire  to  reduce  paperburden,  while  meeting  the 
needs  of  the  research  community,  can  be  dealt  with  more 
effectively  if  one  institutes  a  new  strategy  for  archiving  these 
government  data  in  a  rational  and  consistent  manner. 


The  Needs  of  the  Research  Community 

Because  of  the  nature  of  the  research  scientist's  work,  his 
concern  for  access  to  government  records  is  both  highly 
selective and strongly motivated. Neither the research
scientist nor an informed public can rely solely on the
summary tables produced by government departments and
statistical agencies. Nor do such tables of aggregated
statistics take into consideration the concerns of all scholars.
It must be possible for the public and the research community
to obtain the raw micro-data, conduct independent analyses,
and draw their own conclusions.

Since  the  major  portion  of  the  data  collected  by  government 
for  administrative  purposes  involves  privacy  considerations, 
ways  must  be  found  to  balance  the  freedom  to  do  research 
against  the  individual's  (including  legal  persons')  right  to 
privacy. This principle is stated quite succinctly in a
resolution of the Social Science Federation of Canada, passed
at their May 17, 1979 meeting. The resolution reads:


"There are socially significant fields of research for
which access to personal records is indispensable.
There is, therefore, a need to use personal data held
by government agencies, for statistical and research
purposes, in order to promote scientific
understanding of important contemporary problems.
This use of government data is not incompatible with
the need to protect the privacy of individuals.
Therefore, any federal or provincial laws for the
protection of personal privacy or for access to
government documents should make a clear
distinction between administrative or regulatory uses
of personal information, which directly affect a
person, and statistical or research uses, which do not,
and should explicitly recognize the legitimacy of
using personal data for statistical or research
purposes. Accordingly, provisions in these laws
should set out the right of researchers to obtain access
to personal data under specified conditions, and
should specify these conditions, the most important
being a written undertaking not to reveal data on
specific individuals without their express consent.
Such laws should also provide, in case such access is
refused, a right of appeal to an independent authority,
such as an ombudsman, or a court, or preferably
both." (2)

The opinions expressed in this paper are those of the author, and
are not intended to reflect the position of the Public Archives
of Canada, for which he prepared it as a consultant, nor that of
any other Canadian government department or agency.

Finally,  the  data  needs  of  the  research  community  and  those 
of the government as well would be met if it were possible to
develop a longitudinal, rational, and integrated master
sampling frame of government administrative records. The
creation of such a mechanism and entity would also lead to a
high  probability  of  reduction  of  paperburden  and  extraneous 
survey  data  collection. 

The Role and Responsibility of the Public Archives of
Canada

The broad mandate of the Public Archives of Canada is to
collect public (i.e., government) records, documents and
other historical material of every description which reflect
Canadian society from its beginnings to the present day. The
Public  Archives  has  the  overall  control  responsibility  for  the 
"life-cycle"  of  all  records,  once  their  creation  has  been 
approved.  The  components  of  the  "life-cycle"  of  records 
include: 

Identification, registration and classification; care and
custody; controlled circulation; the establishment of
retention and disposal schedules; and final disposal
through destruction or permanent retention.

Records include all different kinds of physical storage
media for information, e.g., paper, electronic, film.


It is the responsibility of the Dominion Archivist to identify
and appraise all government records as to their historical or
research value, granting permission for the destruction of
those records which are of no archival value and thus not
eligible for permanent retention. The gathering and
preserving of archival records is done with the purpose of
making them available to the government, researchers, and the
public.

The role and responsibility of the Public Archives to the
research community is to ensure that the desired micro-data
found in government administrative records and surveys is
preserved for use. An Advisory Council on Public Records in
which the social sciences are represented provides a vehicle
whereby the research community can make its wishes known. (3)


THE  POLITICAL  AND  LEGAL  ENVIRONMENT 

Present  Legislation  and  Public  Attitudes 

One of the rights which has been recognized in law in many
countries is the right to information privacy, i.e., that
individuals have the right to be informed about the storage of
personal data about themselves and to control the collection,
use, and dissemination of such information. This definition is
grounded on the conviction that, in the end, information
about  individuals  belongs  to  the  individuals  themselves  — 
indeed,   such    information    is   the  essence   of  individuality. 


While  recognizing  that  an  individual  divulges  information 
about  himself  for  a  purpose  —  in  exchange  for  a  good, 
service  or  benefit,  or  as  required  under  law  —  this  approach 
holds that the information is nonetheless his, and that he
retains  rights  with  respect  to  it. 

In general, with respect to the operations of government,
privacy legislation is designed to enable individuals to control
how,  when,  and  to  what  extent  information  about  them  is 
communicated  to  others — especially  where  such 
communications  are  related  to  the  decision-making  or 
administrative  processes  which  affect  them  personally. 
Consequently,  privacy  legislation  normally  contains 
provisions  designed  to  control  national  or  federal  data  banks, 
with  controls  on  collection,  storage,  dissemination,  retention 
and  corrections  of  personal  information.  In  many  countries, 
the  exercise  of  the  rights  under  privacy  is  assisted  through 
requirements  to  list  or  report  on  all  personal  data  collection 
activities. The emphasis in many countries is on the
surveillance of computerized rather than manually recorded
information. Much of the pressure for greater openness in
government has been relieved by legislation on privacy, even
though this aspect has been treated separately from disclosure
of government information of the more general kind.

In  Canada,  privacy  legislation  finds  its  expression  in  Part  IV 
of the Canadian Human Rights Act. It requires that after
March 1, 1980 the government shall inform data sources who
knowingly provide information to a government institution,
during the course of collection, of the purpose and use to which
the information will be put. (4) This rider applies specifically
to administrative uses of information for decision-making
purposes which impact directly on the individual concerned.

Apart  from  privacy  concerns,  many  other  Acts  of  Parliament 
set stringent conditions on access to their records. Some Acts
state specifically which officers or which other Departments
can have access to their data. Most of these access restrictions
for  administrative  purposes  are  presently  also  used  to  deny 
access  for  research  purposes  and  in  some  instances  even  to 
deny  access  to  archivists  when  the  latter  wish  to  appraise  the 
historical  or  research  value  of  records. 

Required  Legal  Environment 

Before  outlining  some  of  the  basic  legal  principles  which 
Parliament  should  embrace  through  legislation,  the  research 
community  would  do  well  to  take  note  of  a  comment  in  the 
decision  of  the  U.S.  Supreme  Court  in  the  "Kissinger  Case." 
(5) The Justices, quoting the Senate Report to the Federal
Records Act of 1950, highlighted that:

"It  is  well  to  emphasize  that  records  come  into 
existence,  or  should  do  so,  not  in  order  to  fill  filing 
cabinets or occupy floor space, or even to satisfy
archival needs of this and future generations, [one
would assume that this means the needs of the
research community] but first of all to serve the
administrative and executive purposes of the
organization that creates them. There is a danger of
this simple, self-evident fact being lost for lack of
emphasis..." (6)

The statement applies universally to administrative records of
all organizations, including those in the public sector.
Changes in the legal environment which would assist in
meeting the needs of the research community must at the least
be compatible with administrative requirements. Fortunately,
a legal environment which stresses efficient and cost-
effective  management  of  recorded  information  will  also,  as 
we  shall  see,  encourage  the  making  available  of 
administrative  data  for  research  purposes. 

The legal environment required to allow for access to
government administrative and survey data should embody
the following concepts:

•  that  no  government  record  may  be  destroyed  or  altered  in 
any  form  without  proper  authority; 

•  that  the  government  be  able  to  identify,  inventory,  and 
describe  all  their  information  holdings; 

•  that  the  government  be  able  to  identify  and  know  the 
information  contents  of  all  its  data  collection  activities; 

• that institutions of government be permitted to disclose
government records under their control to any scholar or
research institution for research purposes; such records to
be in both identifiable and anonymized microdata and in
original and complete form without legal barrier or
statutory exemptions;

• that the government accepts the principle that the security
sensitivity of classified records declines with the passage
of time and that declassification schemes are established
and implemented;

• that for all government records a "life-cycle" is
established and adhered to (i.e., records retention and
disposal schedules);

•  that  the  research  community  apart  from  their  rights  as 
individuals  under  privacy  and  as  individuals  or  groups 
under  freedom  of  information  be  granted  the  right  to 
advise  the  government  in  its  decision-making  process 
pertaining  to  the  disposal  (i.e.  destruction  or  permanent 
retention)  of  government  records;  and 

•  that  the  Dominion  Archivist  can  effectively  declare  as 
archival  any  government  record  (or  a  copy),  to  be  kept 
permanently  for  historical  or  research  purposes. 

On  the  researcher's  side  a  comparable  legal  environment 
must  also  be  created.  Such  an  environment  would  be  based  on 
the  principles  of  ethics  in  research  which  inter  alia  would  also 
include the right of privacy and protection of the individual
where personal information is involved. Researchers might
want to consult government records that are not normally
accessible to the public. One would expect that when access
to such information is granted for legitimate research
purposes,  it  would  be  granted  only: 

• if the research subject has consented to the intended use;

• if the vested interests of the research subject are not
harmed or involved because of the type of information,
general public knowledge of the information, or the type
of data processing involved; or

• if the researcher agrees in writing not to reveal, publish or
otherwise disclose information which would make it
possible to identify any individual person, business, or
organization except "X" years after the birth of an
individual, or when the individual is dead or "X" years
after death, or "X" years after the taking of a census or
survey, or "X" years after the receipt of information
relating to a business, institution or organization outside
of government.


The reason for the "X" years is that these are policy
decisions for the government to make, hopefully in
consultation with the research community. However, the
written undertaking or contract between the researcher and
the government not to reveal information in any form that
could reasonably be expected to identify the research subject,
i.e., guaranteeing anonymity, would cover over 90% of the
research needs.

Should  an  individual  researcher  not  agree  with  such 
conditions,  one  would  expect  him  to  raise  the  question  of 
public  access  (as  distinguished  from  access  for  research 
purposes  under  specified  conditions)  either  through  the  path 
open  to  him  under  freedom  of  information  legislation  or 
through  the  consultative  channels  between  the  research 
community and the government.

The  above  is  an  outline  of  the  basic  political-legal 
environment  for  government  records  which  would  ensure  that 
both administrative and research needs can be met. The
reason for the statements made in this section will become
apparent in the model solution.


A  MODEL  SOLUTION 

Basic  Approach 

The public at large benefits from legitimate uses of
administrative data for research and analysis by government,
businesses, non-profit organizations, academics, etc. Such
data is used to analyze the effectiveness of the delivery of
existing socio-economic programs; to study the causes of
disease, poverty, crime, or migration; and to discern trends in
society which may be of interest to the nation as a whole, to
particular interest groups, or to the individual researcher. Yet
the public is also concerned about the increasing burden of
providing the required information ("paperburden") and
about the real or imagined possibility of the misuse of the data
provided by them.

Further, at the moment, the legislation affecting the use of
administrative data for research purposes is far from clear and
on the whole presents a negative approach. Instead of dealing
with all these laws and regulations on a case-by-case basis,
the model solution proposed here presents a global approach
which nevertheless should be applicable on the detailed level.
The success of the approach taken here depends, however, on
the establishment a priori of the following overall
requirements for government records:

• THE GOVERNMENT MUST ESTABLISH THE
INFORMATION IN ITS POSSESSION (PHYSICALLY
LOCATED IN ONE OF ITS INSTITUTIONS) OVER
WHICH IT HAS OWNERSHIP OF THE CROWN IN
RIGHT OF CANADA AND THE CONDITION OF SUCH
OWNERSHIP. This will also be a requirement for
information falling under FOI, Privacy and Archives
legislation, and is of particular relevance to information
created as part of joint ventures of the government, e.g.,
federal-provincial, as well as for information created or
collected as a result of the contracting out of work using
public funds.

• NO GOVERNMENT RECORD MAY BE DESTROYED
OR ALTERED IN ANY FORM WITHOUT THE CONSENT
OF THE DOMINION ARCHIVIST OR HIS DESIGNATE,
AND IMPROPER DESTRUCTION MUST CALL FOR
AUTOMATIC PENALTIES. Through the establishment of
authorized records retention and disposal schedules, the
Dominion Archivist ensures efficient and cost-effective
management of government records, destroying those
records of no further use or value and selecting for
permanent retention those records having historical or
research value.

• THE DOMINION ARCHIVIST MUST HAVE THE RIGHT
TO DECLARE ANY GOVERNMENT RECORD OR A
COPY THEREOF AN ARCHIVAL RECORD AND
WHERE A COPY IS CONCERNED TAKE EARLY
POSSESSION WHERE EFFICIENCY OR THE NATURE
OF THE RECORD DICTATES. The greater proportion of
administrative data are created as a result of ongoing
programs. Such data are generically known as case files,
i.e., files containing records related to specific repeatable
actions, events, persons, organizations, products,
objects, etc., and which are usually filed and retrieved by
name, number or any other systematic identifier. For the
sake of administrative efficiency and timely delivery of
programs the government has in recent years resorted to
technologies which help it reduce the "paper-shuffling."
Information received from individuals in paper form is
microfilmed, coded and maintained in higher density
storage media. A long letter from a program participant is
reduced to a single computer change-order slip; old
addresses and records of those no longer participating
are, in due course, deleted automatically. Normally when
a file such as that of a Canada Pension Plan beneficiary
finally reaches the archives, all that might be found inside
the file jacket are a few computer change orders, the
current address and the current benefits. Information on
that individual's fifty or sixty years of interaction with
government will not be there. Since modern storage
technologies lend themselves quite readily to copies, the
historical and research interests would be met if
procedures were instituted so that at least a complete
microdata sample of such administrative data would be
deposited in a continuous fashion with the Public
Archives.

In return for authorizing the alteration of records from one
storage medium to another, e.g., paper to microfilm or
machine readable form, the Dominion Archivist would
require a sample to be forwarded to the archives. Sampling
schemes adopted for such series of case files or data banks
should:

•  be  applicable  to  the  paper,  micrographic  and  machine 
readable  components  of  a  data  bank; 

•  be  consistent  with  the  technology  used  for  processing  the 
information  of  that  data  bank; 

•  be  cost-effective  and  administratively  implementable, 
given  the  human  and  financial  resources  at  hand; 

• be statistically acceptable;

• allow for longitudinal studies, i.e., be periodic and
consistent;

• allow for comparative studies, i.e., be integrated with
other data banks so as to create as high a probability as
possible of capturing information on the same data
subject from different data banks.

The last point touches on the problem of record linkage.
Record linkage is the process whereby data from different
record systems are linked on a case-by-case basis, on
grounds that any of the single record systems is
incomplete either with respect to data or required
coverage or both. Record linkage is used to
"reconstitute"  large  portions  of  (past)  populations  or  to 
consider  "statistically"  any  large  population  or  sample  in 
a  multivariate  context. 
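The case-by-case linkage described above can be sketched in a few lines. This is only an illustration under stated assumptions: it assumes both record systems carry the same unique identifier and that an exact match on it is sufficient. The function name `link_records`, the field names, and the sample records are all hypothetical.

```python
def link_records(system_a, system_b, key):
    """Join two record systems case by case on a shared identifier.

    Only cases present in BOTH systems are returned; each linked case
    combines the fields that either system alone was missing.
    """
    index_b = {record[key]: record for record in system_b}
    linked = []
    for record_a in system_a:
        record_b = index_b.get(record_a[key])
        if record_b is not None:
            linked.append({**record_a, **record_b})
    return linked


# Hypothetical records from two separate administrative systems.
pension_records = [{"id": "A1", "benefit": 410}, {"id": "B2", "benefit": 380}]
health_records = [{"id": "A1", "claims": 3}, {"id": "C3", "claims": 1}]

linked_cases = link_records(pension_records, health_records, key="id")
```

Only the case present in both systems is linked here, which is why sampling schemes that select the same data subjects from different data banks matter for this kind of study.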

For  most  large  data  banks  the  basic  file  series  is  organized 
by  unique  identifiers  such  as  the  Social  Insurance 
Number,  corporate  taxation  number,  or  like  unique 
systematic  numerical  or  symbolic  identifiers.  Derivative 
file series, e.g., frauds and prosecutions, appeals, are
organized either as part of a main file series with colour
coding or as a separate file series, often in alphabetical
sequence. Other case files whose existence is the result of
voluntary  rather  than  mandatory  participation  are  often 
organized  alphabetically  within  some  subject,  geographic 
region,  or  industrial  or  product  coding  schemes. 

The  approach  of  the  archivist  has  always  been  that  of 
sampling.  It  is  estimated  that  on  the  whole  the  Public 
Archives  presently  selects  for  permanent  retention  less 
than 5% of all the paper records. Case files can be
selected for permanent retention on the basis of any of the
following criteria:

•  a  case  file  may  be  important  for  the  issues  involved,  due 
to  the  issue  itself  or  the  context  in  which  it  occurred; 
• a case file may be regarded as important for its influence
in  the  development  of  principles,  precedents,  or  standards 
of  judgement  in  such  matters  as  the  definition  of  the 
jurisdiction,  operations,  and  mandate  of  the  department 
concerned; 

•  a  case  file  may  be  regarded  as  important  for  its 
contribution  to  the  development  of  methods  and 
procedures; 

•  a  case  file  may  be  regarded  as  important  for  its 
documentary/illustrative  value; 

•  a  case  file  may  be  regarded  as  important  for  its  research 
value,  which  even  if  minimal  for  one  case  file  would  be 
high  enough  to  warrant  permanent  retention  if  a  sufficient 
number of case files were selected.

Basically, the archivist uses two general kinds of sampling
techniques, "judgement sampling" and "probability
sampling." In both cases the archivist selects for permanent
retention a collection of records on only a few members of the
universe. A more common term for judgement sampling is
selective retention, while probability sampling or just
"sampling" is a technical term for a procedure whereby one
selects a number or "sample" of items from a defined
"population" of items in such a way that every item in the
population has a known chance of being selected. For
example, if one decided to sample by terminal digit 5 of the
Social Insurance Number one would know that the
probability of any individual being in the sample would be 1
out of 10 or 10%. Should one instead decide to sample by first
letter of the surname, for example by the letter "R", the
probability would be 1 in 20 or 5%. The exact details of a
sampling scheme for administrative and survey data will be
presented at a later date. (7)
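As a rough illustration of terminal-digit sampling, the sketch below selects records whose identifier ends in a chosen digit string. It assumes that terminal digits are evenly distributed across the population, which is what the 1-in-10 figure above implies. The function names and the identifiers are made up for the example.

```python
def in_suffix_sample(identifier, suffix="5"):
    """Return True if the identifier's final digits match the chosen suffix."""
    digits = "".join(ch for ch in identifier if ch.isdigit())
    return digits.endswith(suffix)


def suffix_inclusion_probability(suffix):
    """Known chance of selection, ASSUMING evenly distributed terminal digits."""
    return 10.0 ** -len(suffix)


# Made-up identifiers, for illustration only.
records = ["123-456-785", "987-654-321", "555-000-115", "246-810-120"]
sample = [r for r in records if in_suffix_sample(r, "5")]
```

Because the selection rule depends only on the identifier, the same individuals are selected every time the rule is applied, which is what makes a periodic sample usable for longitudinal work.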


If one can now assume that the Dominion Archivist has
acquired administrative data banks in toto or in sample form,
these are some of the basic parameters that might be applied
to the release of microdata under controlled conditions:

•  there  must  be  a  legitimate  and  important  research  purpose 
to  be  served  by  the  process; 

•  the  researcher  must  sign  a  written  contract  specifying  the 
degree  of  detail  below  which  information  taken  from  the 
microdata  may  not  be  disclosed; 

• at the time of the conclusion of the research project or no
later than some specified date, the researcher, if granted
temporary possession of microdata, must either return
such data to the Public Archives or submit an affidavit as
to its destruction;

•  there  must  be  a  prohibition  against  dissemination  of  the 
microdata  to  a  third  party  without  the  written 
authorization  of  the  Dominion  Archivist; 

•  the  researcher  must  submit  a  copy  of  the  publication 
containing  the  data  derived  from  the  microdata  to  the 
Public  Archives  and  in  some  cases  prior  to  publication; 

•  significant  and  mandatory  sanctions  or  penalties  for 
improper  disclosure  of  microdata  would  have  to  be 
founded  in  law; 

• the researcher must have available an ombudsman
mechanism to deal with conflicts relating to the terms of a
research contract before it is signed;

• where the researcher is allowed to take the microdata to
his own facilities for processing and analysis, the
facilities should provide a level of security commensurate
with that required for the microdata.

A  Sample  Application — The  Federal  Government 

In terms of actual application, we can consider the following
(hypothetical) example for a socio-economic program (SEP).
The program is about to shift from a totally paper mode to a
multi-storage media approach, i.e., paper, microform and
EDP.


The program consists of five different levels of file series, namely:

     Title                   Organization              N of Participants

  1. Benefits & Claims       By terminal SIN digit     22,000,000
  2. Appeals                 ditto, colour coded        1,100,000
  3. Frauds & Prosecutions   alphabetic                   220,000
  4. Judicial Decisions      alphabetic                    44,000
  5. Medical Files           By terminal SIN digit      2,000,000
The administrators of the program propose to destroy all
paper records for file series 1, 2 and 5 upon receipt,
microfilming the "relevant" portions instead. They further
propose to destroy all files for file series 1 and 2 two years
after last action, and file series 3 and 5 five years after last
action, while maintaining only summaries of 4, the judicial
decisions.

After  some  deliberation,  the  Dominion  Archivist  approves 
the  records  retention  and  disposal  schedule  subject  to  the 
following  limitations: 


For file series 1 the program administrators are to deposit
at the Public Archives:

• (a) a 0.01% sample of all paper records consisting of those
participants whose SIN number ends in 5555,

• (b) a 0.1% sample of all records microfilmed consisting of
those participants whose SIN number ends in 555,

• (c) a 10% sample of all EDP records consisting of those
participants whose SIN number ends in 5.
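The three samples for file series 1 are nested: a SIN ending in 5555 also ends in 555 and in 5, so anyone in the paper sample is also in the microfilm and EDP samples. The sketch below illustrates this and estimates the sample sizes, taking the 22,000,000 participants of the hypothetical program and assuming evenly distributed terminal digits. The dictionary and function names are illustrative only.

```python
# Terminal-digit rule for each storage medium, as in the hypothetical schedule.
SUFFIX_BY_MEDIUM = {"paper": "5555", "microfilm": "555", "edp": "5"}


def media_sampled(sin_digits):
    """List the storage media whose sample would include this SIN."""
    return [medium for medium, suffix in SUFFIX_BY_MEDIUM.items()
            if sin_digits.endswith(suffix)]


def expected_sample_size(population, suffix):
    """Expected sample count, ASSUMING evenly distributed terminal digits."""
    return population // 10 ** len(suffix)


participants = 22_000_000  # file series 1 in the hypothetical example
expected = {medium: expected_sample_size(participants, suffix)
            for medium, suffix in SUFFIX_BY_MEDIUM.items()}
```

Under these assumptions the paper sample would hold about 2,200 case files, the microfilm sample about 22,000, and the EDP sample about 2,200,000.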

For file series 2, the samples for (a) would be 1%, e.g., all red
colour coded file jackets where the SIN number ends in 55.
For (b) and (c), no separate samples would be taken as it is
assumed that for (c) there would be a code for "appeals" in
the EDP record.

For file series 3, a 5% sample would be taken of all those
whose surname starts with an "R." (Note:
10% of the "R's" would also be found in the level 1 EDP
sample. This would ensure high linkage probability.)


For file series 4, the sample would consist of two
components:


•  those  case  files  for  which  the  judicial  decision  was  of 
special  significance  because  of  person  involved,  nature  of 
case,  precedents,  etc., 

• a 10% sample consisting of the letter "R" (4.94%) and
the letters "A" (2.92%) and "N" (1.69%). While the
letter "B", for example, would give a 10% sample directly,
such a sample would not be "linkable" to file series 3.
One should therefore approach alphabetic sampling in
terms of "common building blocks."
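The "common building blocks" idea can be checked with simple arithmetic. The sketch below uses only the letter shares quoted in the text. It treats the file series 3 sample as the letter "R" alone and the file series 4 sample as "R", "A" and "N", so the smaller sample is contained in the larger one. The variable and function names are illustrative.

```python
# Share of surnames by first letter, in percent, as quoted in the text.
LETTER_SHARE = {"R": 4.94, "A": 2.92, "N": 1.69}

SERIES_3_LETTERS = {"R"}            # roughly a 5% sample
SERIES_4_LETTERS = {"R", "A", "N"}  # roughly a 10% sample


def sample_share(letters):
    """Combined share (percent) of the population covered by these letters."""
    return round(sum(LETTER_SHARE[letter] for letter in letters), 2)


def in_letter_sample(surname, letters):
    """Return True if the surname's first letter belongs to the sample."""
    return surname[:1].upper() in letters


series_4_share = sample_share(SERIES_4_LETTERS)
shares_a_block = SERIES_3_LETTERS <= SERIES_4_LETTERS
```

The three letters together cover 9.55% of surnames, close to the 10% target, and every surname selected for file series 3 is also selected for file series 4.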

In  our  hypothetical  example,  the  department  agrees  with  this 
sampling  plan,  knowing  that  the  Public  Archives  will  abide 
by  the  access  and  disclosure  provisions  of  the  legislation 
pertaining  to  this  data.  Some  of  the  data  can  be  made  readily 
available in anonymized form, e.g., the EDP portion, and the
remainder will become available at a later date. This assures a
respect for the totality of the "fonds."

Now  this  is  a  single  case.  If  the  same  approach  were  applied 
to  cover  all  the  data  banks,  a  national  archival  sample  would 
be created. The matter of a national archival sampling
strategy for administrative data would include advice to the
Dominion Archivist from such agencies as Statistics Canada,
the national research councils, and the research community.
In order to maximize the probability of record linkage for
longitudinal and comparative studies, a national archival
sampling strategy would have to be applied uniformly to all
administrative data. The immediate benefit would be a
reduction in survey costs and resulting paperburden.

It  may  well  happen  that  a  government  institution  "A"  wishes 
to  use  the  data  collected  by  another  department  "B" 
(administrative  or  survey  data)  for  research  purposes. 
However, the legal questions involved may take some time to
resolve. Department "A" makes its case to the Public
Archives, which in turn requests Department "B" to deposit
a copy of the data in question, e.g., a machine readable data
set and documentation, in the Public Archives, which will
hold the magnetic tape until the legal differences are
resolved. Such a mechanism might be of particular interest to
government departments, in that it provides for a single
"neutral" depository of administrative data, thus ensuring
that longitudinal series can be created without a sudden hiatus
caused by some legal problems of direct transfer from "A" to
"B."


The  Public  Archives  would  be  the  neutral  repository  of  these 
samples,  which  would  be  subject  to  the  transfer  conditions 
and  prevailing  legislation.  When  demand  from  the  research 
community  warrants,  "public  use  files"  would  be  prepared. 
In  other  instances,  a  research  file  would  be  created  from  the 
master  file.  The  Public  Archives  may  even  decide  to  split  the 
master  file  into  two  separate  components,  removing  all 
unique identifiers to an XREF file and substituting a control
number instead.
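A minimal sketch of such a split follows. It assumes the master file is a list of records and that the identifying fields are known in advance. The function name, the field names, and the use of sequential control numbers are all assumptions made for illustration; an actual scheme might well assign control numbers differently.

```python
def split_master_file(master, identifier_fields):
    """Split a master file into a research file and an XREF file.

    The research file keeps the substantive fields plus a control number.
    The XREF file keeps the unique identifiers plus the same control
    number, so the two can be re-linked only by whoever holds the XREF.
    """
    research_file, xref_file = [], []
    for control_number, record in enumerate(master, start=1):
        identifiers = {f: record[f] for f in identifier_fields}
        content = {f: v for f, v in record.items()
                   if f not in identifier_fields}
        xref_file.append({"control_number": control_number, **identifiers})
        research_file.append({"control_number": control_number, **content})
    return research_file, xref_file


# Hypothetical master file with two identifying fields.
master = [
    {"sin": "123456785", "name": "Roy", "benefit": 410},
    {"sin": "987654325", "name": "Nadeau", "benefit": 380},
]
research_file, xref_file = split_master_file(master, ["sin", "name"])
```

The research file can then be released under the controlled conditions described earlier, while the XREF file stays with the archives.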

The  research  community  would  peruse  the  federal 
government  inventory  of  data  banks  (required  under  FOI  and 
privacy  legislation)  and  the  national  archival  sample  to 
determine  whether  the  administrative  data  being  archived 
(and the variables contained therein) met their research needs.
If not, a researcher or a research project could make its
wishes known to the Dominion Archivist for the permanent
retention of certain administrative data, even though legal and
financial problems pertaining to access for research purposes
may take some time to resolve.

The  ombudsman  mechanism  could  be  fulfilled  through  the 
creation  of  a  special  committee  of  the  already  existing 
advisory  council  on  public  records  consisting  of 
representatives  of  the  learned  societies  and  like  representative 
bodies. 


References: 

1) Federal Inventory Annual Report, March 1980,
prepared by Federal Inventory Data Base Group,
Statistics Canada. For further information contact Mr.
Dave Sally, Federal Data Bank Manager, Statistics
Canada, Ottawa, Canada, K1A 0T6.

2) As cited in J. Knoppers, "A Freedom of Information Act
and the Future of Social Science Research", Social
Sciences in Canada, 7 (December 1979) 4:9-10.

3) At present the Canadian Historical Association and the
Canadian Political Science Association represent the
research community on the Advisory Council on Public
Records. The Public Archives is giving serious
consideration to widening the diversity of representation
of the research community.

4) The full text of Part IV of the Canadian Human Rights Act
can be found as an appendix to the 1980 Index of Federal
Information Banks which is available for reference in
every Post Office and Canada Manpower Office across
Canada.

5) Kissinger vs Reporters Committee for Freedom of the
Press et al., Supreme Court of the United States, March
3, 1980, No. 78-1088 and 78-1217. The case involved
summary or verbatim transcripts of Kissinger's
conversation notes and telephone notes which he
"unlawfully removed" from the State Department and at
a later date deposited at the Library of Congress with
severe access restriction. In short, the Supreme Court
ruled that the plaintiffs had "no standing" and that only
the National Archives and Records Service and/or the
State Department could pursue Kissinger for return of the
notes.

6) Senate Report to the Federal Records Act of 1950, S. Rep.
No. 2140, 81st Congress, 2nd Session, at 4 (1950).

7) The distribution of SIN numbers by first letter of surnames
for R is 4.94%. The author is currently undertaking a
project to establish an overall mixed numeric-alphabetic
sampling scheme for case files falling under privacy
legislation. This includes a study on the distribution of
first letters of surnames of language, ethnic and
provincial groupings.


THE  CONDUCT  OF  USER  SURVEYS 


Dennis  D.  McDonald,  Ph.D. 

King Research, Inc.
Washington, DC


I'd like to start by making a few statements in general about
user surveys, and then to discuss their goals, their different
varieties, and finally some methodological pointers.

First, there's no such thing as an "end user."

Second, information isn't a commodity like a bar of soap
which can be bought, sold, and priced "over the counter."

Third, don't believe that users and potential users can tell you
pointblank what they really want and need in the way of
information services.

Finally, the more you know about a user before you conduct a
user survey, the better off you'll be.


User survey goals can be classified into the following five
categories:

• Input prior to system development.

You may want to find out the needs of a potential user
group before you invest a lot of dollars in the
development of systems or services. This kind of user
survey is difficult to conduct since what people say they
may need and what they actually end up using may be
two entirely different things.

• Information about your competition.

You may be in the process of developing a system or
service, and you may want to assure yourself that related
information products and services, i.e., your
"competition," are not satisfying the same needs. This
kind of survey is essentially a form of intelligence
gathering, and conducting it will force you to consider
how your product or service will supplement what is
already available.

• Identity of current users.

You may have a system in operation, and you may want
to find out not only how and why people are using your
system, but whether they are satisfied with what your
system is supplying. This is what most people consider to
be a "user survey."

• Potential users.

You may have a system which is operating and satisfying
the needs of a certain core group of users. But you want
to expand the system's use. You must then seriously
think about potential user populations and how to ask
them about their own needs and what information sources
they currently use. I recommend, however, that you
conduct a survey of potential users either after or
concurrent with your survey of current users. Otherwise,
you may never understand the context within which your
current users operate and this will hamper your ability to
promote your system to new users.

• Evaluation.

You may want to evaluate a system by means of a user
survey. In fact, surveys are often conducted as an input to
evaluating many kinds of systems and services, in
addition to those devoted specifically to information
services. But it is wise to remember that a survey is not,
in and of itself, an evaluation method; it is simply a tool
of evaluation, and the actual evaluation depends upon
how the results of the user survey are interpreted.

Each of these five goals is related to the different kinds of
user survey you might conduct. And accomplishment of these
goals is also related to the four points I made earlier.

I mentioned, for example, that "there is no such thing as an
end user." The term "end user" implies a certain finality, as
if the information your system supplies converges at one
point and "sits there," like the "stop" symbol in a
flowchart. If you think about the people who have access to
your products and services, you probably hope that the
information you supply doesn't stop with them. Ultimately,
you want it to affect their thinking and behavior. One of their
potential behaviors is sharing the information you supply with
others, perhaps in a substantially modified form. (This can
also feed back into increased use of your services through
"word of mouth" advertising.) So when you survey your
users you want to know not only what they do with the
information they obtain; you should also find out how they
retransmit this information, and to whom they retransmit it.
(This is a practical way of identifying potential users.)

I also mentioned that information isn't a commodity, like a
bar of soap. This also affects the goals you can accomplish
through a user survey. For example, two of the most
important questions you will ask when developing or
modifying your system are "How much are people going to
use it?" and "How much are they willing to pay to use it?" If
you are developing a database which will be accessed online,
you may need to address yourself to two groups: the people
who sit at the terminal and interact with the system, and the
people who actually use the output of the online interaction;
these two groups are not always the same. The kinds of
questions you ask these two groups may be substantially
different. The first group you will want to ask about their
interaction with the system and how they relate to their users.
The second group, the actual users, you'll want to ask what
they do with the information they obtain from the system. If
both of these groups have no experience with the type of
system by which your database will be accessed, asking them
hypothetical questions about amount of use and willingness to
pay may very well be counterproductive. You would do much
better to ask them about the types of decisions and problems
they have to face on a day-to-day basis. Since information is
only one of the resources people use to solve problems and
make decisions, a detailed understanding of these problems
(what some people call the "situational context") will be
useful.

KINDS  OF  SURVEYS 

Given the relationship between the user survey goals and the
types of questions which you might focus upon in your user
survey, several different kinds of surveys begin to emerge.

The first kind of user survey is the one which concentrates on
the user's interaction with the system. Here the types of
questions concern such things as:

•  Frequency  of  access 

•  Type  of  access  channel 

•  Payment  for  use 

•  The  nature  of  the  access  system 

• The physical medium used to supply the information

• The ease with which the system can be accessed

•  Reasons  for  satisfaction  and  frustration  with  the  system 


The  second  kind  of  user  survey  concentrates  on  the 
information  which  is  supplied  by  the  system.  Here  the 
relevant  questions  are: 

•  The  type  of  problems  or  decisions  which  are  important  to 
the  user 

•  The  role  information  plays  in  solving  these  problems 

•  The   relative   use  of  different   information   sources   for 
solving  these  problems 

•  The     perceived     accuracy     and     reliability     of    these 
information  sources 


In the real world, it's impossible in a user survey to
completely separate reactions to (1) the contents of a database
from (2) the system used to obtain access to the contents,
since the two are so often intertwined. The perfect example of
this is the system which allows the user to manipulate a
database using a very flexible, personalized command
language. Essentially, the user is generating a unique product
each time he or she sits down in front of the terminal. In this
case, the actual contents of the database may take secondary
importance to the way in which the user interacts with the
system, since it may be the interaction method which makes
the system unique.


METHODOLOGICAL  POINTERS 

Now,  I'd  like  to  discuss  some  methodological  points  related 
to  user  surveys.  These  deal  with  three  main  issue  areas: 

•  How  the  survey  should  be  conducted 

•  How  the  sample  is  designed 

•  How  the  questionnaire  is  designed 
One assumption I make is that a survey, by definition,
involves the collection of a standardized set of data elements
from a group of individuals. A further assumption is that data
collection may involve an interaction with geographically-
dispersed individual users, instead of, for example, direct
observation of their behavior.

Concerning  sample  design,  by  far  the  most  important  issue  is 
how  best  to  define  and  identify  a  population  of  users  from 
which  to  sample.  Sample  size  is  not,  as  may  be  commonly 
believed,  the  major  issue.  The  major  issue  is  ensuring  that 
you  contact  the  right  people,  the  people  who  are  qualified  to 
answer the questions you have. The more time and effort you
devote to sample development, the greater will be your reward.
Some of the practical implications of this view are the
following:

First,  while  it  is  cheaper  to  purchase  a  list  of  individuals' 
names  and  addresses  from  a  commercial  list  supplier  or 
professional  association,  you  may  be  better  off  if  you  develop 
your  own  list,  perhaps  over  time,  of  people  who  have  used  or 
who have expressed an interest in your system. (This, of
course, will not be sufficient for studying all potential users.)

Second,  try  to  obtain  complete  information  on  the  type  of 
organization  which  employs  your  system's  potential  user. 
Prefer the development of lists with business address, rather
than  home  address.  That  way  you  can  subdivide  or  stratify 
your  sample  by  employer  type  before  you  conduct  your 
survey. 
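As a rough sketch of what stratifying by employer type could look like, the example below groups a mailing list by employer type and draws the same fraction from each group. The field name `employer_type`, the list itself, and the equal-fraction rule are assumptions made for illustration, not a recommended design.

```python
import random
from collections import defaultdict


def stratified_sample(mailing_list, stratum_key, fraction, seed=0):
    """Draw the same fraction of people from each stratum (at least one)."""
    rng = random.Random(seed)  # fixed seed so the draw is repeatable
    strata = defaultdict(list)
    for person in mailing_list:
        strata[person[stratum_key]].append(person)
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample


# Hypothetical list: 10 university, 6 government and 4 business addresses.
mailing_list = (
    [{"name": f"U{i}", "employer_type": "university"} for i in range(10)]
    + [{"name": f"G{i}", "employer_type": "government"} for i in range(6)]
    + [{"name": f"B{i}", "employer_type": "business"} for i in range(4)]
)
sample = stratified_sample(mailing_list, "employer_type", fraction=0.5)
counts = {
    kind: sum(1 for person in sample if person["employer_type"] == kind)
    for kind in ("university", "government", "business")
}
```

With business addresses on file, each employer type is represented in proportion to its share of the list, which a simple random draw does not guarantee.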

As  a  general  rule,  try  to  settle  upon  the  specific  population 
whose use you will be studying before you draft your
questionnaire. That way you can be sure that the questions
you want to ask are appropriate for your sample, and vice
versa.

Never  assume  that  your  users  are  thoroughly  familiar  with 
your  information  center  or  system;  always  describe  in  detail 
the  system  or  service  whose  use  you  are  studying. 
Remember,  your  system  is  only  one  of  the  information 
sources  your  user  relies  upon. 


Be prepared to define and explain your questionnaire terms in
great detail. This is especially true when you are dealing with
information products and services. Terms such as
"information," "access," and "use" must be defined
concretely if you are to have confidence in your analysis. It's
because of the need to provide extensive definitions that I am
partial to self-administered mail questionnaires, since they
allow the researcher to provide supporting definitions and
instructions while being cheaper than personal interviews.

Be  sure  to  pretest  your  questionnaire  with  respondents  who 
are similar to the ultimate survey respondents. You'll be
surprised how much pretest respondents may misinterpret
your questions and response categories.

Be  sure  to  involve  your  programmer  in  questionnaire  design. 
There  is  nothing  more  frustrating  than  finding  you  can't 
analyze  a  question  in  a  particular  way  since  the  coding  and 
editing  were  done  after  the  questionnaire  was  filled  out.

Devote  resources  to  following  up  and  increasing  your 
response  rate.  If  it  comes  to  a  tradeoff  between  spending  time 
and  money  on  increasing  the  initial  sample  size,  and  spending 
time  and  money  to  follow  up  your  initial  mail-out  sample,  go 
with  the  latter. 

Try  to  avoid  designing  questionnaires  by  committee,  and  set  a 
definite  limit  on  the  number  of  questionnaire  revisions  you 
can  tolerate.  Remember  that  no  questionnaire  is  perfect,  since
no  two  individuals  are  alike  and  they  will  want  to  make 
exceptions  no  matter  how  detailed  your  planning  is. 

Finally,  always  allow  respondents  in  your  survey  the 
opportunity  to  express  themselves  in  their  own  words,  even  if 
the  questionnaire  is  substantially  pre-coded;  you  will  be 
surprised  how  valuable  individual  handwritten  comments  can 
be,  especially  when  they  suggest  potential  uses  and  benefits
of  your  system  which  you  may  have  overlooked. 


Be  sure  to  allow  your  user  in  the  survey  to  give  some
indication  of  frequency  of  use.  You  will  want  to  divide  your
analysis  into  "frequent  users"  and  "infrequent  users";  their
preferences  may  differ  significantly.

Approach  "willingness  to  pay"  questions  with  great  caution.
Many  individuals  are  not  accustomed  to  paying  directly  for
information  services.  The  person  whom  you  want  to  ask
about  payment  questions  may  be  the  business  officer  or
administrator  with  sign-off  responsibilities,  not  the  assistant
researcher  who  will  not  be  paying  you  directly.


CONCLUSION

The  most  important  secret  to  conducting  a  good  user  survey 
cannot  be  described  in  methodological  terms.  Excellent
sampling,  proper  questionnaire  wording,  mistake-free  survey 
processing,  and  sophisticated  analysis  cannot  overcome  an 
incomplete  or  limited  understanding  of  the  product  or  service 
whose  use  you  are  measuring.  Be  sure  to  ask  yourself  why
you  are  conducting  the  survey.  If  you  seriously  try  to  answer
this  question  before  you  start  out  you  will  go  a  long  way  to 
making  sure  that  your  survey  helps  you  answer  your  most 
important  questions. 
PRIVATE  AND  PUBLIC  SECTOR

RESPONSIBILITY  FOR  THE 

COLLECTION,  DISTRIBUTION  AND  ANALYSIS 

OF  STATISTICAL  DATA 


Joseph  E.  Kasputys 

Data  Resources  Inc.
Lexington,  Mass. 


Statistical  information  is  a  vital  national  resource.
Quantitative  data  on  who  we  are  and  what  we  do  influence
countless  decisions  in  all  parts  of  the  Federal,  state  and  local
governments,  in  commercial  and  investment  banking,  in 
manufacturing,  in  service  and  retail  trade,  and  in  various 
nonprofit  enterprises  such  as  hospitals,  schools  and 
foundations.  Such  data  also  play  a  role  in  our  personal  lives, 
shaping  education  and  career  choices,  personal  finance  and 
investment  strategy,  regional  preferences  and  other  lifetime 
decisions.  The  collection  and  distribution  of  statistics  is  an 
essential  role  of  government.

Like  most  resources,  statistical  data  is  indeed  a  limited 
resource.  The  American  public,  in  general,  and  business,  in 
particular,  as  evidenced  in  the  mounting  concern  over 
paperwork  burdens,  have  a  finite  capacity  to  respond  to  the 
ever-growing  demands  for  information  from  the  government.
It  simply  costs  too  much  to  respond  to  the  total  sum  of 
information  requirements  arising  from  legislation,  regulation, 
and  other  individually  worthy  program  requirements.  Once 
the  data  are  collected,  this  proliferation  of  information  can 
have  the  effect  of  turning  "more  into  less"  by  making  it 
extremely  difficult  to  select  relevant  data  from  a  myriad  of
conflicting  sources,  definitions,  and  time  periods.  At  the
same  time  that  both  the  capacity  to  produce  and  the  capacity 
to  utilize  statistical  data  have  approached  the  limits  of 
reasonableness,  we  have  had  the  accompanying  orthogonal 
concerns  over  the  right  of  the  public  to  have  access  to  data 
collected  by  the  government  and  the  right  of  the  individual 
person  or  business  to  receive  appropriate  protection  from 
invasions  of  privacy.

For  these  reasons,  a  more  comprehensive  effort  does  need  to 
be  made  to  review  and  control  Federal  statistical  policy.
Current  programs  conducted  by  the  Office  of  Federal 
Statistical  Policy  and  Standards  are  certainly  helpful  to  assure 
that  maximum  use  is  made  of  the  existing  statistical  system
and  that  new  requirements  are  appropriately  designed  to 
impose  a  minimum  burden  on  the  reporting  public  while 
meeting,  to  whatever  extent  possible,  any  unfulfilled  needs 
for  data  that  affect  a  broader  user  community.  In  my  own 
view,  these  current  programs  are  not  sufficient.  Further,  it  is 
unlikely  that  any  measures  will  be  truly  sufficient,  including 
proposals  to  establish  an  Office  of  Federal  Information  Policy
or  an  independent  Office  of  Statistical  Policy  in  the  Executive
Office  of  the  President,  until  both  the  Legislative  and 
Executive  Branches  fully  recognize  statistical  data  as  a  scarce 
resource  and  begin  to  treat  it  accordingly.  However,  more 
central  review  and  control  should  contribute  to  this  realization
and,  therefore,  I  would  support  some  form  of  increased
coordination  and  control  by  the  Executive  Office  of  the
President.

In  considering  what  types  of  control  may  be  appropriate,  it  is 
helpful  to  reflect  on  why  the  Federal  government  collects
statistical  information.  I  believe  there  are  four  major  reasons:

1.  Regulation.  Regulation  implies  control,  and  control
requires  comparison  of  actual  performance  with  a
standard.  This  comparison  requires  measurement,  which
converts  into  a  need  for  statistical  data.  The  explosion  in
Federal  regulatory  activity  has,  in  turn,  generated
enormous  data  requirements.  Legislators  have  also
learned  that  statistical  data  options  can  affect  regulatory
outcomes.  As  a  result,  legislation  increasingly  specifies
the  data  to  be  used,  which  limits  the  flexibility  that  should
be  present  in  the  design  of  the  Federal  statistical  system. 

2.  Program  Operation.  Many  programs  require  data  in  order
to  operate  at  all.  Major  examples  include  revenue
sharing,  local  public  works,  and  similar  grant  programs
tied  to  specific  formulas.  Other  programs,  while  not
operated  purely  on  a  statistical  basis,  require  data  for 
evaluation  of  effectiveness  and  possible  modification.  As 
with  regulation,  those  who  design  programs  in  the 
Executive  Branch  and  enact  them  into  law  in  the 
Legislative  Branch  have  learned  that  the  data  used  do 
have  a  material  impact  on  program  results.  This  is,  of
course,  true  with  formula  programs,  where  endless 
varieties  of  variables  are  tested  in  alternative  formulas 
until  one  is  found  that  provides  the  program  designer  with 
a  distribution  of  funds  that  is  intuitively  acceptable. 
Again,  the  needs  of  the  Federal  statistical  system  are
secondary  in  this  process,  which  instead  encourages  the
development  of  increasing  amounts  of  data  collection  to
provide  statistics  that  are  uniquely  appropriate  for  each
program.

3.  Policy  Analysis.  Statistical  data  must  be  collected  to
provide  information  to  policymakers  in  the  Executive  and
Legislative  Branches  on  economic  and  financial 
conditions  that  exist  nationally  and  internationally.  Data
on  the  national  income  accounts,  production,  prices, 
population,  trade,  and  similar  items  fall  into  this 
category.  It  is  probably  the  area  that  requires  the  most
difficult  decisions  and  one  in  which  an  office  charged
with  statistical  policy  can  have  the  greatest  influence.  If
the  data  needs  of  society  could  be  correctly  anticipated,
and  if  proper  discipline  were  exercised  in  program
design,  legislation,  and  regulation,  practically  all  the
statistical  information  needed  for  regulation  and  program 
operation  would  be  regularly  available  through  the 
thoughtful  and  coordinated  development  of  the  statistical
system.  While  this  may  be  an  ideal  that  can  never  be
reached,  it  is  a  worthwhile  goal  to  keep  working  toward.

4.  Information  Programs.  Certain  government  programs 
exist  for  the  principal  purpose  of  keeping  the  public 
informed.  The  intent  may  vary  from  stimulating
technological  innovation  through  the  diffusion  of
knowledge  to  reducing  market  imperfections  through
improved  information.

What  statistical  policy  control  and  coordination  will  be 
effective  given  these  sources  of  data  requirements?  For  the
first  two  sources,  regulation  and  program  support,  it  appears
clear  that  statistical  policy  must  be  effectively  incorporated
into  legislative  proposals  and  regulatory  actions.  This  can  be
best  done  from  the  perspective  of  the  Executive  Office  of  the
President.  Hopefully,  with  growing  public  concern  over  the
paperwork  and  reporting  burden,  the  Legislative  Branch  will
become  more  sensitive  to  the  need  to  make  reference  to 
statistical  policy  and  precepts  before  enacting  legislation  and 
make  increasing  use  of  the  capabilities  of  the  proposed 
organization.  In  the  third  area,  statistical  data  for  policy
analysis,  I  believe  the  existing  highly  decentralized  Federal 
statistical  system  requires  the  improvement  that  can  only  be 
obtained  through  an  organization  that  has  broad  perspective
over  Federal  activities  and  sufficient  clout  to  make  a
difference.  This  again  argues  for  a  stronger  central  role  from
a  high  level. 

The  fourth  area,  government  programs  to  provide
information  to  the  public,  is  more  controversial.  Prior  to  the
extensive  development  of  publishing,  media,  and  information 
industries,  the  government  indeed  fulfilled  an  important  need 
by  supplying  information  facilitating  commerce  and  industry 
that  would  not  otherwise  be  available.  However,  in  1980,  we
find  a  highly  developed  information  industry,  utilizing  the 
latest  technology,  that  has  a  strong  capability  to  identify 
information  needs,  collect  data,  deliver  results  frequently 
tailored  to  the  needs  of  specific  clients,  and  even  provide 
special  analysis  on  the  meaning  of  information  for  business 
decisions.  In  lieu  of  unilateral  government  determinations  on
the  nature  of  private  sector  information  requirements,  market 
demand  and  the  profit  motive  can  now  be  used  more 
extensively  to  govern  statistical  collection.  Equally 
important,  the  reporting  public  becomes  free  to  choose
whether  to  respond  to  information  requirements  generated  by
the  private  sector,  which  can  be  expected  to  be  in  proportion
to  the  perceived  value  of  the  information.  The  efforts  of  a
strengthened  central  statistical  policy  office  should  focus  on
transferring  data  collection  and  distribution  of  this  nature  to  the
private  sector,  while  improving  and  rationalizing  statistical 
operations  in  support  of  regulation,  program  operation  and 
policy  analysis. 

I  would  like  to  encourage  any  new  statistical  office  to  remain 
separate  from  other  aspects  of  information  policy.  Statistical
policy  is  sufficiently  unique  and  important  to  deserve  separate 
attention.  To  be  sure,  statistical  policy  must  be  established 
with  an  awareness  of  telecommunications,  ADP 
management,  privacy,  and  related  concerns;  but  should  not
be  merged  with  these  other  functions  which  are  more 
involved  with  regulatory  concerns.  Incidentally,  with  the 
reduced  costs  of  computers  and  communications  and  rapidly 
rising  costs  of  personnel,  the  emphasis  on  strong  central 
control  over  the  acquisition  and  use  of  ADP  and  related 
services  is  probably  misplaced.  More  emphasis  should  be 
given  to  the  improvement  of  productivity  through  the  use  of
these  capabilities  than  to  elaborate  restrictions  on
procurement.

Given  that  the  Federal  government  does,  and  must,  collect
statistics,  the  government  bears  a  major  responsibility  to
make  this  information  available  to  the  public.  Divergent
interpretations  of  the  term  "making  information  available  to
the  public"  are  possible.  Interpretations  of  this  term  can
range  from  placing  information  in  a  public  reading  room
somewhere  in  Washington,  D.C.,  such  as  the  SEC  public
reading  room,  to  the  Federal  government's  placing  all
available  statistics  in  a  shared  computer  and  aggressively
marketing  these  statistics  to  private  sector  organizations.  To
the  extent  that  controversy  exists  over  the  interpretation  of  the
Federal  role  in  information  dissemination,  I  believe  that  much
of  it  can  be  traced  to  two  conflicting  principles.  On  the  one
hand,  we  wish  the  Federal  government  to  make  available  all
information  that  is  collected  to  the  public,  except  where
individual  or  business  rights  to  proprietary  and  confidential
information  would  be  violated.  On  the  other  hand,  an  equally
important  precept  in  our  society  is  that  the  Federal
government  should  not  encroach  on  activities  that  properly
belong  in  the  private  sector.  It  is  fortunate  for  our  economic
health  that  the  Federal  government  has  avoided  entering  into
legitimate  private  sector  enterprises  where  not  needed
because,  despite  this  restraint  on  private  sector
encroachment,  we  have  only  seen  the  role  of  the  government
grow  bigger  over  time.

It  is  clear  that  an  extensive  information  industry  has 
developed  in  the  United  States  to  take  full  advantage  of 
available  communications  and  computer  technology.  Some
testimony  to  the  existence  of  this  industry  is  an  article  in  the
May  5,  1980  edition  of  FORTUNE  (which  incidentally  is  the
25th  anniversary  edition  of  the  FORTUNE  500),  entitled
"Everything  You  Always  Wanted  to  Know  May  Soon  Be  On-Line."
This  article  explores  the  rapid  development  of  the  on-line
information  industry,  which  includes  both  bibliographic
and  statistical  databases.  The  on-line  database  industry 
includes  firms  that  specialize  in  organizing  and  delivering  just 
one  or  a  few  databases,  along  with  those  that  provide  almost 
encyclopedic  coverage  of  all  available  quantitative  data.
Many  of  these  firms  indeed  carry  data  that  are  collected  and
published  by  the  Federal  government.  Since  they  are  involved
in  assessing  market  needs  and  matching  these  needs  to  data
availability,  it  is  likely  that  most  of  the  markets  for  statistical
data,  including  data  provided  by  the  Federal  government,  that
can  be  economically  served  either  have  been  or  will  be
identified.  Products  will  be  developed  based  upon  available
data  by  the  private  sector  information  industry  to  serve  the
needs  of  these  markets.

It  is  important  to  emphasize  that  I  know  of  no  one  in  the
industry  who  does  not  believe  that  the  government  should 
have  access  to  all  available  technology  to  use  in  the 
dissemination  of  its  data.  Our  principal  concern  is  rather
whether  the  government  is  competing  with  already  existing 
private  sector  activities,  which  include  computers, 
telecommunications  and  software  that  have  been  developed  at 
considerable  private  expense.  One  consideration  may  be
whether  the  existing  industry  efforts  adequately  serve  all  the
markets  for  information  that  require  such  service.  There  is
little  concern  over  whether  the  FORTUNE  500  companies,
major  banks,  or  large  research  organizations  are  being
adequately  served  with  data.  Questions  have  been  raised  from
time  to  time  over  whether  the  needs  of  the  small  organization
or  the  individual  researcher  in  the  university  are  being
adequately  met,  and  whether  the  government  should  not  step
in,  using  the  latest  available  technology,  to  meet  these  needs.
This  is  clearly  a  matter  for  public  policy  to  decide.  If  such
needs  are  not  now  being  met  by  the  private  sector,  it  is  likely
that  they  cannot  be  met  economically  on  a  full  cost  recovery
basis.  At  such  time  as  the  technology  would  permit  such
needs  to  be  met  economically,  I  am  certain  that  the  private
sector  would  step  in.  Since  it  is  unlikely  that  the  government
could  perform  these  services  any  more  economically  than  the
private  sector,  we  are  then  faced  with  a  decision  as  to  whether
to  subsidize  information  dissemination  to  these  specialized 
markets.  It  is  my  own  belief  that  in  most  cases,  if  the 
government  makes  information  available  through 
depositories  and  through  statistical  publications  and  reports, 
the  small  business  or  the  individual  researcher  with  an 
occasional  need  for  information  will  be  able  to  gain  adequate 
access  to  the  data  and  reports  he  or  she  seeks.

I  would  also  encourage  the  government,  as  I  mentioned 
earlier,  to  make  full  use  of  computer  and  communications 
technology  in  its  own  internal  dissemination  and  utilization  of
statistical  data.  The  government  has  been  an  advanced  user  in
this  regard.  All  private  citizens  would  hope  this  will  continue,
for  it  should  lead  to  the  more  effective  use  of  statistical  data
for  public  policy  development.  It  is  axiomatic  that  data
always  receive  less  analysis  than  they  deserve.  New 
technologies  available  through  the  computer  and 
communications  networks  should  permit  higher  levels  of 
analysis  to  be  conducted,  leading  to  better  decisions.  The 
only  caveat  to  be  considered  is  the  classical  "make  or  buy" 
decision  that  is  observed  within  government,  which  centers 
around  whether  to  develop  an  inhouse  capability  for  the 
storage,  distribution  and  analysis  of  such  data  or  whether  to
obtain  it  from  private  contractors.  Here  again,  I  believe  there
is  ample  evidence  to  support  the  view  that  the  private  sector  is
considerably  advanced  in  the  distribution  and  analysis  of
quantitative  data  and  has  much  to  offer  the  government  in  this
regard.

In  order  to  gauge  the  adequacy  of  the  on-line  database 
information  industry,  it  might  be  useful  to  provide  a  few 
additional  words  of  clarification  on  industry  operations.  DRI
serves  as  a  good  example  of  this  industry  and  indeed  was 
cited  in  the  FORTUNE  article  mentioned  earlier  as  "probably 
the  preeminent  company  in  the  on-line  information  field." 
DRI  has  spent  many  millions  of  dollars  over  the  past  13  years
in  collecting,  organizing  and  documenting  data.  Only  a  small
portion  of  these  data  originate  in  the  Federal  government. 
While  DRI  does  maintain  data  on  the  national  income 
accounts,  prices,  population,  housing  starts  and  the  like, 
much  of  DRI's  data  originates  elsewhere,  from  such  agencies
as  the  OECD,  IMF,  World  Bank,  and  foreign  governments;  and
from  daily  stock  prices,  commodity  quantities  and  prices,
market  research  surveys,  and  material  published  privately  by
trade  associations.  A  representative  listing  of  DRI  databases  is
attached  in  an  Appendix  to  this  paper.  DRI's  current  on-line
storage  capability  for  immediate  access  material  is  in  the 
neighborhood  of  36  billion  characters  and  is  scheduled  to 
grow  to  55  billion  characters  by  the  end  of  this  year.  We 
believe  we  have  the  largest  collection  of  economic  and 
financial  information  available  anywhere.  Many  of  these 
databases  were  originally  issued  in  cross-sectional  format.
DRI  has  taken  many  periods  of  cross-sectional  data  and 
converted  them  into  consistent  time  series.  This  often 
involves  dealing  with  definitional  changes  from  one  time 
period  to  the  next,  changes  in  the  data  collection  approach,  or 
changes  in  the  basic  entities  measured.  Once  organized  in  a
consistent  time  series  format,  the  data  are  described  both  in
on-line  documentation  and  in  reference  manuals.  Mnemonics
are  assigned  for  easy  use  in  referencing  through  software.
DRI  has  developed  software  that  permits  any  and  all  on-line
data  to  be  retrieved  and  manipulated  with  operations  varying
from  seasonal  adjustment  to  all  forms  of  regression,  to
forming  equations,  to  building  models,  to  simulating  the 
models,  to  producing  reports  and  graphs  for  effective 
presentation  of  results.  More  importantly,  it  has  been  DRI's
finding  that  on-line  databases  can  only  be  efficiently  and
economically  offered  to  markets  with  a  good  understanding
of  the  applications  to  which  the  data  will  be  put.  Indeed,  part
of  our  product  offerings  includes  applications  methodologies
and  consulting  assistance  that  enable  users  to  combine  their 
own  data  on  sales,  production  costs,  and  competition  with  the 
generally  available  data  provided  by  DRI. 
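The  conversion  of  successive  cross-sections  into  one  consistent  time  series,  as  described  above,  can  be  sketched  as  follows.  This  is  a  minimal  illustration  in  Python  and  not  a  description  of  DRI's  software:  the  category  codes,  the  "MOTORVEH"  mnemonic,  the  figures,  and  the  assumption  that  a  definitional  change  can  be  handled  by  summing  old  categories  into  a  new  one  are  all  hypothetical.

```python
def build_time_series(cross_sections, code_map):
    """Turn {period: {code: value}} cross-sections into {mnemonic: {period: value}}.

    code_map translates category codes used in earlier periods into the
    mnemonic of the consistent series; codes not listed are kept as they are.
    """
    series = {}
    for period in sorted(cross_sections):
        for code, value in cross_sections[period].items():
            mnemonic = code_map.get(code, code)
            by_period = series.setdefault(mnemonic, {})
            # Old categories that map to the same mnemonic are summed.
            by_period[period] = by_period.get(period, 0) + value
    return series

# Hypothetical cross-sections: in 1980 two categories were merged into one.
cross_sections = {
    1978: {"AUTO": 100, "TRUCK": 40},
    1979: {"AUTO": 110, "TRUCK": 45},
    1980: {"MOTORVEH": 160},
}
code_map = {"AUTO": "MOTORVEH", "TRUCK": "MOTORVEH"}

series = build_time_series(cross_sections, code_map)
print(series["MOTORVEH"])
```

Real  definitional  changes  are  rarely  this  clean;  a  change  in  collection  method  or  in  the  entities  measured  may  require  judgment  that  no  simple  mapping  table  can  capture.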

Questions  have  also  been  raised  on  the  appropriate  role  of  the
Federal  versus  the  private  sector  in  performing  analysis  of
statistical  data.  It  is  my  belief  that  the  Federal  statistical
agencies  should  focus  their  analytical  efforts  on  developing 
descriptive  information  from  data  that  are  collected.  This 
involves  organizing  data  to  present  it  effectively,  so  that  the 
users  of  the  data  can  fully  understand  what  the  statistics 
mean.  This  is  a  critical  role  which  will  often  stimulate  policy
action,  since  proper  presentation  will  of  itself  indicate  that 
such  action  is  needed.  Any  analysis  that  goes  beyond  display 
or  rearrangement  of  factual  information  should  be  limited  to 
political  appointees  and  policymakers,  together  with  their 
supporting  staffs,  who  should  be  kept  separate  from  the 
statistical  agencies  and  the  information  which  they  release.
This  latter  type  of  analysis  would  include  forecasts  and
interpretations  of  the  underlying  meaning  of  statistical
information.  It  is  important  that  analysis  of  this  sort  be
released  separately  from  the  factual  statistical  information
provided  by  Federal  agencies  so  that  the  public  can  readily
tell  fact  from  opinion.

Private  sector  firms,  of  course,  should  engage  in  any  and  all 
analysis  appropriate  for  the  markets  being  served.  Indeed,
this  provides  a  plurality  of  information  to  the  general  public.
The  information  industry  has  an  important  role  to  play  in 
providing  alternative  interpretations  of  statistical  data  to 
businesses  and  industry,  which  can  then  be  used  to  form 
opinions  on  the  validity  of  the  analysis  being  provided 
through  the  political  system.  This  plurality  of  information, 
together  with  multiple  sources  of  statistical  information 
delivery,  is  a  healthy  aspect  of  our  society  and  is  essential  to 
maintaining  an  informed  public  which  can  reach  independent
judgments  on  major  economic  and  financial  issues.

The  development  of  our  national  statistical  system  continues 
to  be  a  matter  of  great  importance  to  all  economic  and 
financial  activity.  The  proper  development  of  this  system
goes  beyond  narrow  economic  and  financial  concerns,  and
has  direct  influence  in  maintaining  the  principles  of  our
democracy.  The  development  and  use  of  statistics  should  be  a
mutual  undertaking,  shared  by  both  the  Federal  and  private
sectors.  I  believe  these  sectors  have  worked  together  very
cooperatively  in  the  past  to  give  the  United  States  clearly  the
best  statistical  data  available  anywhere  in  the  world.  We  now
face  new  challenges  in  how  we  can  work  together  to  fully
utilize  the  advances  that  technology  has  given  us.  I  believe
that  continuing  dialogues  among  all  concerned  will  contribute 
materially  to  our  ability  to  face  these  new  challenges  and 
meet  the  needs  of  the  general  public  and  our  private  clients  in 
the  most  effective  manner  possible. 


APPENDIX:  A  GUIDE  TO  DRI  DATA  BASES 


Age-Income  Model 

Agriculture  and  Weather 

Automotive 

Best  Executive  Data 

California 

Canada 

Canadian  Model  Data  Bank 

Chemical  Data  Banks 

Coal  Data  Bank 

Coal  Model  Data  Bank 

Commodities  Market  Data  Bank 

Compustat

The  Conference  Board 

Consumer  Expenditure  Survey 

Flow  of  Funds 

Current  Population  Survey 

Platt's  Data  Bank

Developing  Countries  Primary  Source  Data  Banks

PRO  FORMA  Data  Bank 

DRIFACS 

SITE  II 

DRI-SEC 

Standard  &  Poor's  Industrial  Financial  Data 

Drilling  Data  Banks 

State  and  Area  Forecasting  Service  Data  Bank 

Energy 

Steel  Data  Banks 


Forestry  and  Wood  Service  Data  Banks 

IBRD's  World  Debt  Tables  Data  Bank

IMF's  Balance  of  Payments 

IMF's  International  Financial  Statistics 

Industry  Financial  Service 

Insurance  Service  Data  Bank 

International  Energy  Data 

International  Trade  Information  Service  Data 

Japan 

New  York  City  Model  Data  Bank 

OECD  Main  Economic  Indicators 

OECD  National  Income  Accounts

OECD  Trade  Series  A 

Cost  Forecasting  Service  Data  Banks 

Paper  and  Pulp  Data  Banks 

European  Model  Data  Bank 

Target  Group  Index 

European  National  Source  Data  Bank

Transportation 

FDIC  Data 

U.S.  Central 

FIEI 

U.S.  Macro  Model  Data  Bank

U.S.  Prices 

U.S.  Regional 

U.S.  Weekly  Banking

Value  Line 
NEWS  and  NOTES


WEST  COAST  IASSIST  SEMINARS  GOING  STRONG 


The  US  West  Coast  contingent  of  IASSIST  met  once  again
on  July  24th  at  the  Rand  Corporation  in  Santa  Monica.  This
was  the  third  meeting  held  on  the  West  Coast,  and  there  are
indications  that  interest  in  these  seminars  is  growing.

The  July  24th  meeting  was  attended  by  25  representatives 
from  research  centers,  universities,  and  libraries  located  at 
Berkeley,  Davis,  Los  Angeles,  Riverside,  Dominguez  Hills,
and  San  Diego,  California.

After  a  report  on  the  Annual  Conference  of  IASSIST  given  by
Jackie  McGee,  the  agenda  continued  with  a  presentation  by
Don  Trees,  the  Rand  Corporation.  Don  gave  his  analysis  of
the  responses  to  an  IASSIST  interest  questionnaire  (see  this
issue).  The  survey  had  been  conducted  to  determine  future
plans  for  this  dynamic  group.  As  a  result  of  the  survey,  it  is
expected  seminars  and  workshops  will  continue  to  be  held,
alternating  between  Southern  California  and  Northern
California  two  or  three  times  a  year.  Copies  of  the  analysis
will  be  mailed  to  those  persons  who  were  mailed 
questionnaires. 

A  second  presentation  given  by  Ilona  Einowski,  State  Data
Program,  U.C.  Berkeley,  described  for  the  group  her
problems  and  the  results  of  her  efforts  to  merge  the  data 
holdings  of  two  separate  collections  into  one  data  center. 

Libbie  Stephenson,  ISSR,  University  of  California,  Los
Angeles,  commented  on  the  similarities  and  disparities  she
has  noted  between  the  traditional  library  and  a  data  library.
Libbie  stressed  the  need  for  more  communication  between  the
two  groups.  Later,  Jackie  McGee,  the  Rand  Corporation,
illustrated  how  the  Data  Facility  Resource  Guide  was
formatted.  The  emphasis  of  this  presentation  was  on  the
indices  included  in  the  guide.

Each  presentation  included  exchanges  of  ideas  from  those 
attending.  A  final  roundtable  discussion  included  the  plans
for  the  next  meeting  to  take  place  sometime  in  early
November  at  a  Northern  California  location.

November  Presentations 

At  the  November  17th,  1980  meeting  of  IASSIST
participants  on  the  West  Coast,  a  number  of  interesting
presentations  were  made.  The  meeting  was  attended  by
persons  affiliated  with  academic  and  research  institutions  at
Stanford,  U.C.  Berkeley,  U.C.  Los  Angeles,  Cal  State
Northridge,  Cal  State  Dominguez  Hills,  and  U.C.  Davis  as
well  as  the  Rand  Corporation,  Lockheed,  Los  Angeles
County  Public  Library  System,  and  the  Lawrence  Berkeley
Laboratory.  The  meeting  was  hosted  by  Ms.  Karen
Wittenborg,  Reference  Librarian  and  Social  Science
Bibliographer  at  Stanford  University  in  the  Cecil  H.  Green
Library.

After  opening  remarks  and  introductions  by  Ms.  Wittenborg,
the  first  presentation  was  given  by  Ms.  Jackie  McGee  of  the
Rand  Corporation.  Ms.  McGee  spoke  on  her  experiences  in
using  the  program  package  SAS  (Statistical  Analysis  System)
and  the  utilities  available  for  archive  management.  As  an
example,  a  SAS  program  is  used  to  generate  a  major  variable
index  of  specific  data  files.  SAS  also  affords  considerable
versatility  in  providing  records  of  data  usage  for  statistical
purposes.

Next  on  the  agenda  was  a  fascinating  and  intriguing  talk 
given  by  John  McCarthy  about  SEEDIS  (Socio-Economic
Environmental  Demographic  Information  System)  which  is
being  developed  at  Lawrence  Berkeley  Laboratory.  SEEDIS
represents  an  effort  to  collect  data  on  air  quality,  energy
planning,  employment  development,  environmental  impacts,
epidemiology,  land  use,  and  1980  Census  data  into  an
accessible  data  base.  The  effort  has  been  supported  by  the
Department  of  Energy,  Department  of  Labor,  Bureau  of  the
Census,  Environmental  Protection  Agency,  and  other
government  agencies.  The  system  uses  a  VAX  computer-based
network  with  nodes  presently  in  Berkeley,  CA,
Richland,  WA,  San  Francisco,  CA,  Seattle,  WA,  and
Washington,  DC.  Eventually  there  will  be  nodes  situated  in
each  of  the  twelve  regional  offices  of  the  Department  of
Labor.  Through  a  variety  of  modules,  the  system  can  provide
a  number  of  statistical  and  geographic  types  of  information
and  can  generate  an  array  of  charts,  graphs,  tables,  maps,  and
plots  as  well  as  printed  reports.

The rest of the day was devoted to the subject of the 1980 Census.
Jackie McGee gave a presentation highlighting the CENSPAC software
package as contrasted with the programs which manipulate the
DUALabs format of census data. Elizabeth Stephenson of UCLA spoke
on reference services with census data as conducted at UCLA. The
program has been successful due to a good cooperative relationship
between the University Research Library, the Social Science Data
Archive, and the computer center. Consultants, librarians, and
other staff members utilize a variety of printed and
machine-readable information sources in providing assistance, and
take into consideration the needs and computer sophistication of
each researcher.

Ilona Einowski (UC Berkeley) presented information on the
developing Sacramento State Census Data Center and the role and
function of the regional offices which will disseminate census
information. The regional centers will be able to provide a number
of products and services as well as tape processing. Printed
information will be kept on file and there will be a staff
available for program consultation and for locating desired
information. The regional center at Berkeley will primarily serve
the academic community in California.

Finally, the meeting closed after considerable discussion on the
regional IASSIST conference to be held in May of 1981. Topics to be
included as well as the format in which they should be presented
were noted. Personnel are needed who are willing and able to spend
time planning and organizing. For further information on this or
other plans of IASSIST on the West Coast contact: Jackie McGee, The
Rand Corporation, (213) 393-0411 ext. 7351; Elizabeth Stephenson,
Institute for Social Science Research, UCLA, (213) 825-0711
ext. 241; or Ilona Einowski, University of California at Berkeley,
(415) 642-6571.


ORGANIZATIONAL  IMPLICATIONS  FROM  THE 
WEST  COAST  LOCAL  SEMINAR  SURVEY 


Over a year ago, West Coast IASSIST members started getting
together to discuss common problems. These local get-togethers
eventually evolved into organized local seminars on issues of
concern to the staff of machine-readable data archives and their
users. As the presentations became more structured, non-IASSIST
individuals from local government, universities and private
organizations began to attend the meetings, and a mailing list was
subsequently generated for contact purposes.

In April of 1980 the mailing list consisted of 72 names. These
names had been collected from the West Coast IASSIST membership
list, non-IASSIST seminar attendees, and individuals whose interest
was suggested by those attending the seminars.

In an effort to refine the local IASSIST seminar list to represent
only interested individuals and to better focus the seminar
discussions on topics of concern to participants, a survey of the
existing mailing list was conducted. The survey addressed three
issues:

•  Determining who was participating or interested in
participating in the local IASSIST seminar program

•  Selecting how participants would like the seminars to be
structured

•  Defining topics which were of interest to participants

The  names  of  non-respondents  are  being  re-evaluated  as  to 
their  potential  interest  in  the  local  seminar  activities. 

Respondents 

A total of 72 questionnaires were mailed. The overall response rate
was 60%. Of the 43 responses to the survey, three did not return
questionnaires and three returned incomplete questionnaires. The
analysis of respondent characteristics and interests is therefore
based on 37 questionnaires, or 51% of the original mailing list.
Those 37 individuals returning useable questionnaires are now
considered the core of the West Coast IASSIST seminar program.
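
The response figures can be checked with a few lines of arithmetic. The sketch below assumes the counts are read as 72 questionnaires mailed, 43 responses, three responses without a questionnaire, and three incomplete questionnaires; the variable and function names are illustrative only.

```python
# Counts as read from the survey report; names are illustrative.
mailed = 72         # questionnaires mailed
responses = 43      # replies of any kind
not_returned = 3    # replied but did not return a questionnaire
incomplete = 3      # returned an incomplete questionnaire

# Useable questionnaires are the responses left after both exclusions.
usable = responses - not_returned - incomplete


def percent(part, whole):
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


response_rate = percent(responses, mailed)
usable_rate = percent(usable, mailed)

print(f"useable questionnaires: {usable}")
print(f"overall response rate: {response_rate}%")
print(f"useable share of mailing list: {usable_rate}%")
```

Under those readings the useable count comes to 37, which is the base for the percentages reported in the rest of the article.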

Respondents were generally from academic libraries or research
centers and are currently members of IASSIST. Respondents' work
organizations were classified primarily as academic (75%), with
non-profit corporations (14%) and commercial (11%) next. Their
organization's primary function was considered as best being
described as library (35%), research center or group (30%), archive
or data bank (19%), computing or data processing center (8%),
research project (3%), or other (5%).

Thirty-eight percent were IASSIST members in 1979, while 51% are
currently IASSIST members.

Seminar  Structure 

The  responses  to  questions  concerning  seminar  structure  are 
being  used  to  plan  future  meetings. 

Respondents preferred that the local group meet two or three times
a year and focus on workshops (14%) and outside speakers (28%).
Member presentations (18%) and roundtable discussions (14%) were
also mentioned. The current plan is to schedule a workshop, outside
speaker, or member presentation for each session and follow with
roundtable discussions. Several roundtable discussions can occur
simultaneously. One discussion can focus on the topic of the
workshop, speaker, or member presentation, and the others represent
special interest groups or current topics of interest to members.
Several roundtables will be planned in advance, but the number and
content will always remain flexible.

Another question involving the organization of the seminars
indicates that respondents preferred a management orientation (57%)
to topics, as opposed to a technical (34%) or policy (14%)
orientation.

Previous seminars have been well attended, with the number of
participants ranging from 10 to 25. Forty-six percent of those
responding indicated that they had attended a local IASSIST
meeting, 72% planned to attend future seminars, and 28% suggested
that they might attend. No less than 57% of those responding
indicated that they would be willing to host a future seminar.


Interest  Areas 

A list of topics for discussion at local or international meetings
was presented, and respondents were asked to indicate their level
of interest in each. Respondent interest is indexed below in two
ways: the percent of respondents "interested" or "very interested"
in a topic and the percent difference between those indicating
"very interested" and those indicating "not interested."

The two indexes must be interpreted differently, and an examination
of the percentages shows that they do not necessarily rank the
topics in the same order. The percentage difference between those
"very interested" and those "not interested" indicates the relative
strength of the two groups. Negative percentages identify topics
where the not interested group outnumbers the very interested
group.
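
As an illustration of why the two indexes can rank topics differently, here is a minimal sketch. It assumes three response categories per topic ("very interested", "interested", "not interested") and uses made-up counts; the function name and the counts are hypothetical and are not taken from the survey.

```python
def interest_indexes(very, interested, not_interested):
    """Return (share, difference) as whole-number percentages.

    share      = percent "interested" or "very interested"
    difference = percent "very interested" minus percent "not interested"
    """
    total = very + interested + not_interested
    share = round(100 * (very + interested) / total)
    difference = round(100 * (very - not_interested) / total)
    return share, difference


# Hypothetical counts for three topics, 37 respondents each.
topics = {
    "Topic A": (10, 25, 2),
    "Topic B": (20, 10, 7),
    "Topic C": (5, 15, 17),
}

for name, counts in topics.items():
    share, difference = interest_indexes(*counts)
    print(f"{name}: {share}% interested, {difference:+d}% difference")
```

With these invented counts, Topic A ranks first on the share index while Topic B ranks first on the difference index, and Topic C shows a negative difference because its not interested group is the larger one.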


Topics

Data organization and management
Data access and retrieval systems
User needs and services
Documentation methods and standards
1980 Census data use
Data sources and acquisition strategies
1980 Census data availability
1980 Census data software
Archive development and administration
Networking systems
Subject area data bases
Criteria for data evaluation
Processing complex data bases
Functions of libraries versus data archives with
regards to machine-readable data
Cataloging methods and standards
Storage technologies
Survey research methodology
Methods of secondary data analysis
Error identification and resolution
Educating and training archive staff
Confidentiality and data restrictions
Computer cartography and mapping


Interested/
Very Interested

100%
97%
95%
91%
89%
86%
81%
79%
76%
75%
75%
73%
72%
71%
64%
62%
54%
50%


Very Interested/
Not Interested
Difference

+36%
+58%
+45%
+14%
+60%
+.V*%
+55%
+3%
+9%
-i%
-14%
-14%
-22%
-39%


Topics for which interest is high will become the subjects for
workshops and outside speakers. Topics with smaller numbers of
"very interested" members will become candidates for special
interest groups and roundtables.


IASSIST Conferences and Membership

Almost 90% of the respondents indicated that they would attend an
International Conference held on the West Coast. Only three
individuals from the West Coast group were able to attend the 1980
International Conference in Washington. Reasons for not being able
to attend included no money available (57%), conference site too
far (21%), conflicting schedules (19%), sessions not relevant (5%),
and undesirable location (5%).

Over 76% of the respondents "strongly agreed" that a "Directory of
Archives and Data Banks currently in operation" would greatly
increase the value of belonging to IASSIST, and 65% said the same
of a "list of major MRDF data sources according to government,
commercial, non-profit corporations and academic categories." Other
suggestions for increasing the value of IASSIST membership elicited
fewer "strongly agree" responses: an IASSIST Brochure (28%), a
Directory of Members (19%), and a Directory of Member Organizations
(19%). Of course, the great majority of respondents (97%) felt that
availability of any of these products would increase the value of
an IASSIST membership.

This survey was conducted to assist the West Coast IASSIST group in
planning future local seminars. The results are not necessarily
generalizable to the national membership or any other geographic
constituencies. The methods utilized may, however, have
implications for organizing other local area groups.

I believe that the notion of topical areas could further be
explored at the national level to help determine the nature of
future IASSIST offerings. However, I feel the organization's
interface role between research, data processing, archives and
libraries is broad enough to encourage and support multiple special
interest groups at both the local and national level for any of the
above listed topics.


Don  Trees 
The  Rand  Corporation 


DATA  CENTER  FOR  HOUSING  RESEARCH  UNDERWAY 


Researchers interested in housing policy and programs for
low-income households will be able to access the data files from
the Experimental Housing Allowance Program (EHAP). The Housing
Research Data Center is scheduled to open in January, 1981. It will
be operated by Data Use and Access Laboratories (DUALabs) under a
U.S. Department of Housing and Urban Development contract. The
first set of data files to be made available through the Data
Center will be those from EHAP.

EHAP originated with the 1970 Housing Act. Its purpose was to test
the concept of housing allowances for low-income families. It is
the largest social experiment involving housing assistance ever
undertaken. Over 30,000 households in 12 areas of the country were
included. The EHAP data come from a variety of surveys and
administrative forms designed to measure the responses of families
who participated in the programs, the effects of allowances on
housing markets, and the methods and costs of administering housing
allowance programs. Information about participants includes
detailed demographic characteristics, attitude profiles, and
characteristics of their housing. The latter is based on on-site
inspections. Other data sources include questionnaires administered
to landlords and a survey of non-participating households at the
EHAP sites. Longitudinal analysis, based on four survey waves and
five years of program data, is possible with some of the files.

HUD's objective in funding the Housing Research Data Center is to
increase the use of data collected in HUD-sponsored research
projects such as EHAP. The Data Center is being designed for easy
access and low cost to researchers in academic, government, and
private organizations. More than 100 of the original EHAP data
files are being combined and reformatted to simplify access and
use. The data and supporting documentation will be available via
remote access, with training and consultation available to those
who need assistance.


Persons interested in the Data Center should contact:

Housing Research Data Center
DUALabs
1601 North Kent Street, Suite 900
Arlington, Virginia 22209
(703) 525-1480


1980  GSS,  VIRGINIA  SLIMS  POLLS  AVAILABLE 


1980 General Social Survey

Data from the 1980 General Social Survey, the eighth in the series,
are now available through the Roper Center for Public Opinion
Research. The 1980 survey maintains continuity with the earlier
investigations and also adds interesting new variables, including
items in which respondents were asked to locate their own
preferences and the activities of the federal government on
seven-point scales with regard to reducing defense expenditures,
improving the socioeconomic position of blacks and other
minorities, and reducing the level of governmental services so as
to reduce governmental spending.

The General Social Survey was conducted each year from 1972 through
1978, and again in the spring of 1980, by the National Opinion
Research Center of the University of Chicago. Each of these
comprehensive investigations of opinions, values, and cultural
orientations contains approximately 450 variables. The sample size
for the 1980 survey is 1,468.

The 1980 GSS data set is also available as part of a combined
1972-1980 data file. This file has each of the eight annual surveys
merged into a single file for data processing. The combined GSS
1972-80 file can be secured in either raw data format (card images)
or as a completed SPSS system file with each year defined as a
subfile. The SPSS system file contains complete variable and value
labeling for ease of use.


Virginia  Slims  American  Women's  Opinion  Polls,  1974 
and  1980 

These studies, conducted by the Roper Organization for Virginia
Slims, investigate the lives of American women in the 1970's. The
polls cover such areas as the move toward equality, sex roles and
stereotypes, marriage and divorce, motherhood, family relations,
the world of work, leisure activities, and personal and financial
concerns. They highlight changes in the activities and thinking of
women and changes in the society itself over the past decade, and
foreshadow changes to come.

Respondents, age 18 and older, were interviewed face to face in
their homes. Interviews for the 1974 poll took place in the spring
of that year; those for the 1980 poll, in late 1979.


Sample size:    2,922 women     2,960 women
                958 men         984 men
Questions:                      90


The  data  files  are  on  magnetic  tape.  Information  on  format 
and  acquisition  costs  is  available  from: 

User  Services 

The  Roper  Center,  Box  U-164R 

The  University  of  Connecticut 

Storrs,  CT  06268 

Telephone: (203) 486-4440/4882
ENCYCLOPEDIA PUBLISHES SIDEL ON DATA ARCHIVES


Philip S. Sidel of IASSIST authored the article "Social Science
Data Archives" in the Encyclopedia of Library and Information
Science, v. 28, published in 1980 by Marcel Dekker. Sidel's 15-page
piece will be useful to anyone who needs a brief history and
overview of data archiving. Sections include: What are social
science data archives?; historical background of the data archive
movement; some major archival organizations (USA, Canada, Western
Europe, other nations; data libraries); U.S. government data
archives; efforts at coordination and cooperation; roles and
functions of data archives; problems and issues (funding and
finance, services, standardization and coordination). There is a
33-item bibliography.

Sidel is technical director, Social Science Computer Research
Institute, University of Pittsburgh.


LETTER  TO  THE  EDITOR 


The Spring issue of the IASSIST Newsletter (vol. 4, no. 1) carried
an editorial by Alice Robbin titled "By the Seat of My Pants." I do
feel that this editorial raises an extremely important issue, one
which IASSIST should give serious attention to. Alice raises the
question of preservation of machine-readable data files. She does
so in the particular context of asking the question "Who will
assume the long-term responsibility for maintaining an
original/master/archival copy of a particular data file and its
accompanying documentation?"

IASSIST would do the research community a service if it would
devise a system of notation for machine-readable data files whereby
one could ascertain which data archive has taken upon itself the
responsibility for maintaining the tapes and documentation
pertaining to a particular file. One solution would be to add a
field in the data file descriptive format for this purpose,
identifying the particular data archive with this responsibility
via a three or four letter (digit) code. For example, if the
Steinmetzarchief has the archival responsibility for a data file,
the notation in the MRDF catalogue would be "AR:STM" or similarly
for Alice's shop "AR:DPLS."

This is just one suggestion. Perhaps other members have other
ideas. In any case, this issue raised by Alice should be answered
by IASSIST without delay.

Dr. Jake Knoppers

Senior  Advisor  (Information  Management) 

Public  Archives  Canada 
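
The notation suggested in the letter above, "AR:" followed by a three or four character archive code, could be read by a catalogue program along the following lines. This is only a sketch under stated assumptions: the function names and the lookup table are hypothetical, the letter does not say whether codes are upper case, and only the codes STM and DPLS appear in the letter itself.

```python
import re

# Hypothetical lookup table; only these two codes appear in the letter.
ARCHIVE_CODES = {
    "STM": "Steinmetzarchief",
    "DPLS": "DPLS (Alice Robbin's archive)",
}

# Assumed field layout: the prefix "AR:" followed by a code of three or
# four letters or digits. Upper case is an assumption of this sketch.
FIELD_PATTERN = re.compile(r"^AR:([A-Z0-9]{3,4})$")


def parse_archival_code(field):
    """Return the archive code from a field such as "AR:STM"."""
    match = FIELD_PATTERN.match(field.strip())
    if match is None:
        raise ValueError(f"not an archival responsibility field: {field!r}")
    return match.group(1)


def responsible_archive(field):
    """Return the archive name for a field, or a placeholder if unknown."""
    code = parse_archival_code(field)
    return ARCHIVE_CODES.get(code, "unknown archive")


print(responsible_archive("AR:STM"))
print(responsible_archive("AR:DPLS"))
```

Keeping the codes in a single table means a catalogue entry stores only the short code, while the full archive name is maintained in one place.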


CORRECTION 


In preparing the tape for IASSIST Newsletter 4:2, the references
for Laine Ruus's "User Services in a Data Library" were
inadvertently omitted. They are:

Green, Samuel Swett. "Personal Relations Between Librarians and
Readers." Library Journal 1, 1876, pp. 74-81.

[Inkeles, Alex.] Project on the Social and Cultural Aspects of
Development. Books 1-8. (Storrs, Connecticut: The Roper Center,
n.d.).

Nasatir, David. Data Archives for the Social Sciences: Purposes,
Operations, and Problems. (Reports and Papers in the Social
Sciences, no. 26) Paris: Unesco, 1973.

Robbin, Alice. "A Guide to Data Archive Organization, Management,
and Servicing." (Unpublished ms.) Jan. 1977.

Rothstein, Samuel. "Reference Service: The New Dimension in
Librarianship." College and Research Libraries 22, 1961,
pp. 11-18.